Richard C. Anderson and Peter Freebody

Our aim in this paper is to summarize what is known about the role of
vocabulary knowledge in reading comprehension. Though word identification
skills are important in reading, this paper is concerned exclusively with
knowledge of word meanings. An assessment of the number of meanings a
reader knows enables a remarkably accurate prediction of this individual's
ability to comprehend discourse. Why this is true is poorly understood.
Determining why is important because what should be done to build
vocabulary knowledge depends on why it relates so strongly to reading. The
deeper reasons why word knowledge correlates with comprehension cannot
be determined satisfactorily without improved methods of estimating the
size of people's vocabularies. Improved assessment methods hinge, in
turn, on thoughtful answers to such questions as what is a word, what does
it mean to know the meaning of a word, and what is the most efficient way
of estimating vocabulary size from an individual's performance on a sample
of words.

Vocabulary Knowledge and Linguistic Ability

Measures of vocabulary knowledge are potent predictors of a variety
of indices of linguistic ability. The strong relationship between
vocabulary and general intelligence is one of the most robust findings
in the history of intelligence testing. Terman (1918), for instance,
reported a correlation of .91 between mental age (as assessed by the
Stanford Revision of the Binet-Simon Scale) and the vocabulary subscale.
On this basis he suggested that the vocabulary measure alone constitutes
a good estimate of performance on the entire scale and thus could be
used as a short measure. Since then, this suggestion has been tested
with various age groups. Table 1 summarizes representative evidence.
In these studies, correlations between vocabulary subtest scores and
total test scores on a number of different IQ and achievement tests
have ranged from .71 to .98.

Insert Table 1 about here

An equally consistent finding has been that word knowledge is
strongly related to reading comprehension. Davis (1944, 1968) factor
analyzed nine comprehension tests and found a main factor for word
knowledge on which a vocabulary test loaded about .8. Thurstone (1946)
reanalyzed Davis' original data and found three major factors:
vocabulary knowledge, ability to draw inferences from a paragraph, and
ability to grasp the main idea of a paragraph. In the years that followed,
several factor analytic studies identified a "reading comprehension"
factor (Fruchter, 1948; Botzum, 1951; Wrigley, Saunders, & Newhaus, 1958;
Clark, 1972). The range of factor loadings for vocabulary tests in these
studies was .41 to .93. These findings indicate the need for a central
role for word knowledge in any model of reading comprehension.

Analyses of readability (cf. Bormuth, 1966) also demonstrate the
preeminent role of word knowledge. In a study of the factors that make
prose difficult to read, Coleman (1971) examined morphological, syntactic,
and semantic properties of words and sentences. While he found sentence
complexity to be a fairly important variable, he was able to conclude
that "any measure of word complexity (number of letters, morphemes, or
syllables; frequency of usage) will account for about 80% of the
predicted variance" (p. 184). Klare (1974-1975), in a review of readability,
also concluded that a two-variable formula is sufficient for most
practical purposes: one variable relates to word difficulty and the other to
syntactic or sentence difficulty. He went on to conclude that the word
variable is consistently more highly predictive of difficulty than is
the sentence variable. As would be expected, some index of vocabulary
difficulty has typically been given the heaviest weight in readability
formulas.

Why Is Vocabulary Knowledge a Major Factor in Linguistic Ability?

There are three more or less distinct views of why vocabulary knowledge
is such an extraordinary correlate of linguistic ability. We will call
the first the instrumentalist position: Individuals who score high on a
vocabulary test are likely to know more of the words in most texts they
encounter than low scoring individuals. The heart of the instrumentalist
hypothesis is that knowing the words enables text comprehension. In other
words, this hypothesis claims that vocabulary knowledge is directly and
importantly in the causal chain resulting in text comprehension. Unlike
the two positions described below, the instrumentalist hypothesis has
nothing to say about where vocabulary knowledge comes from, but only that,
once possessed, it helps the reader understand text.

According to the second position, vocabulary tests measure verbal
aptitude. A person who scores high on such a test has a quick mind. With
the same amount of exposure to the culture, this individual has learned
more word meanings. He or she also comprehends discourse more readily than
the person who scores low on a vocabulary test. The essential claim of the
aptitude hypothesis is that persons with large vocabularies are better at
discourse comprehension because they possess superior mental agility. A
large vocabulary is not conceived to be involved in a direct way in better
text understanding in this model. Rather, vocabulary test performance is
merely another reflection of verbal ability, and it is verbal ability that
mainly determines whether text will be understood.

The third position is the knowledge hypothesis. Performance on
vocabulary tests is seen as a reflection of the extent of exposure to the
culture. The person who scores high has deeper and broader knowledge of the
culture. The essential idea is that it is this knowledge that is crucial
for text understanding. Rather than being directly important, possessing a
certain word meaning is only a sign that the individual may possess the
knowledge needed to understand a text. For instance, the child who knows
the word mast is likely to have knowledge about sailing. This knowledge
enables that child to understand a text that contains sentences which do
not even involve the word mast, such as, "We jibed suddenly and the boom
snapped across the cockpit."

Of course, jibe, boom, and cockpit are specialized words, too. It
might be wondered whether the instrumental hypothesis and the knowledge
hypothesis are really different. Strong versions of the two positions
are distinguishable, at least. The instrumental position, as we choose
to characterize it, stresses individual word meanings. The knowledge
view emphasizes conceptual frameworks or "schemata"; individual word
meanings are merely the exposed tip of the conceptual iceberg.

Which of these three positions is most tenable? The main point to
be made is that there are neither the theoretical tools nor the data to
justify a conclusion at the present time. A second important point is
that it would be naive, indeed, to assume that one of the positions will
turn out to be entirely right and the other two entirely wrong.

The most fully developed position is that vocabulary knowledge
reflects verbal aptitude. As the studies reviewed earlier indicate,
vocabulary tests intercorrelate highly with a variety of other kinds of
tests reflecting intelligence. On its face, this fact is hard to understand
solely in terms of the instrumentalist or knowledge positions. Probably
by metaphorical extension of notions of physical agility, it is customary
to speak of people of high intelligence as having "quick" minds. Recently
Earl Hunt and his associates have been trying to prove that this is more
than a metaphor (cf. Hunt, 1978). They theorized that people of high
verbal ability are literally faster than other people at elemental verbal
coding and recoding operations. One task used to assess speed of mental
operations, developed by Posner (cf. Posner & Mitchell, 1967), involves the
subjects' deciding whether pairs of upper or lower case letters match. In
one condition, the subject has to judge if two letters have the same name
(e.g., aA), and in the other condition, the decision is whether or not the
letters are physically identical (e.g., AA). The subjects' responses are
timed. It is argued that a time measure derived from this task is a pure
index of the speed of some elemental verbal operations, since the subject
needs to "look up" in memory the names of the two letters and compare them.
Hunt and his collaborators have found that this measure correlates about
.30 with standardized tests of verbal ability. This is a relationship
that could not have been predicted and is not readily explained by the
other hypotheses being entertained.

Nevertheless, the case is far from conclusive. The general ability
tests used in Hunt's studies probably placed subjects under at least some
implicit time pressure. This could have given fast workers an advantage.
If so, the studies may have revealed that fast people are fast rather than
that fast people are smart. Consistent with this interpretation are the
results of a factor analysis of representative paper and pencil ability
measures and laboratory reaction time tasks completed by Hunt, Lunneborg,
and Lewis (1975). The measures of speed of really elemental processes,
such as letter matching time, loaded on a factor that appears to represent
clerical speed and accuracy instead of on the factor representing general
intelligence. A study by Kirby and Das (1977) also indicated that processing
speed is a separable factor in tests of verbal and spatial abilities.

With respect to the instrumentalist position, as the evidence
reviewed earlier indicates, word difficulty is highly predictive of
readability. Does this fact clinch the argument in favor of the
instrumental hypothesis? No, since it is possible that variation among
texts in vocabulary difficulty is merely symptomatic of deeper differences
in knowledge prerequisites. To prove that knowing the meaning of individual
words has an important instrumental role in understanding text would
require more than correlational evidence. It would need to be shown (a)
that the substitution of easier or more difficult words in a text makes
that text easier or more difficult to comprehend, and (b) that people are
helped to comprehend a text if they learn the meanings of the unfamiliar
words it contains. A cursory look at the literature bearing on these points
suggests that the assumptions of the instrumentalist position are
unquestioned tenets rather than hypotheses in need of verification.

There is some research in which texts have been altered so as to vary
word familiarity (see Chall, 1958, for a review of the early studies). In
a recent set of experiments, Wittrock, Marks, and Doctorow (1975; see also
Marks, Doctorow, & Wittrock, 1974) replaced 15% of the words in several
passages with either high-frequency or low-frequency synonyms. Sixth
graders of every level of reading skill evidenced better comprehension of
texts containing easy words than texts containing hard words, whether they
were reading or listening. Furthermore, children who began with an easy
text later showed improved comprehension of the hard version of the same
text. Performance on a vocabulary test suggested that children who had
first received the easy version of a passage were able to learn some of
the low-frequency words in the hard version.

Other recent evidence is less favorable to the instrumentalist
position. Tuinman and Brady (1974) were unable to increase fourth, fifth,
and sixth grade students' comprehension of texts that contained a
substantial proportion of difficult words by direct instruction on those
words, even though such instruction significantly increased the students'
performance on the vocabulary items themselves. These authors concluded
that the instrumental hypothesis seems to be ruled out. Jenkins, Pany,
and Schreck (1978; see also Pany & Jenkins, 1977) were also unable to
establish that vocabulary instruction improves reading comprehension.
Several different methods for teaching word meanings were explored. All
were at least somewhat better than no instruction. The method which
proved most effective with both average and learning disabled children
involved intensive drill and practice on the words in isolation. However,
even when children had definitely learned the meanings of twelve difficult
words they did no better than uninstructed children, who definitely did
not know these words, on a cloze test or in retelling a brief story
containing the twelve difficult words. We do not know how to reconcile
the conflicting results bearing on the instrumental hypothesis other than
to conclude, as reviewers of educational research must so often conclude,
that more research is needed.

Turning now to the third position, there is now a truly substantial
case that background knowledge is crucial for reading comprehension (cf.
Anderson, 1978). However, the evidence to support the view that
vocabulary scores primarily reflect such background knowledge is thin. We
shall cite just one study which suggests that the idea is plausible.
Steffensen, Jogdeo, and Anderson (in press) asked natives of the U.S.
and of India to read passages describing an American and an Indian wedding.
The results showed that the native passages were read more rapidly and
recalled in greater detail. There were more culturally appropriate
elaborations of the native passages and more culturally inappropriate
distortions of the foreign ones. The vocabulary of the two passages was
closely controlled. For instance, there were only two words in the Indian
passage, sari and dhoti, referring to articles of women's and men's
clothing, respectively, that would have been unfamiliar to any of the
American subjects. These two words did not figure in any important way in
the passage, so failure to know them could have had no more than a
negligible effect. Still, a two item vocabulary test, examining knowledge
of sari and dhoti, would have been an excellent predictor of performance on
the Indian passage. All Indian subjects would have known both words. Some
Americans would have known sari but very few would have known dhoti. It is
apparent that the test would have neatly divided subjects in terms of the
extent of their knowledge of Indian culture, which was obviously the
underlying reason for the large observed differences between Indians and
Americans in comprehension, learning, and memory.


Instructional Implications of Different Hypotheses 
About Vocabulary Knowledge 
It is important to know which of the three hypotheses about vocabulary
knowledge is most nearly correct because the views have radically
different implications for the reading curriculum. At one extreme, some who
endorse the verbal aptitude hypothesis are fatalistic about whether any
environmental factor can have a major influence on children's reading.
They tend to recommend family planning instead of curriculum innovation
as the final solution to the reading problem. Of course the verbal
aptitude position does not require the belief that heredity is predominant.
Alternatively, there are those who maintain that verbal ability grows in
proportion to the volume of experience with language. The greater the
opportunities to use language, the faster and more efficient become the
elemental processing operations. In turn, speed and efficiency permit
greater benefit from each successive language encounter. More detailed
accounts of this sort of position can be found in the well-known paper
by LaBerge and Samuels (1974) and a recent paper by Perfetti and Lesgold
(in press).

The latter formulation of the verbal aptitude hypothesis leads to the
recommendation that educators should try to maximize the amount of reading
children do. However, this is not very newsworthy. It is a practice that
would be endorsed no matter what the theoretical persuasion. The
distinctive emphasis in the verbal aptitude position is on speed and
efficiency of processing. This emphasis gives rise to the recommendation
that beginning readers and poor readers receive extensive drill and
practice on the "fundamentals" of reading. According to Perfetti and
Lesgold (in press), the drill activities should include even more practice
than typically provided in word vocalization, more practice in speeded word
recognition, and more practice in immediate memory for the literal content
of text. It should be noted that these suggestions are offered in the
spirit of a hypothesis. Perfetti and Lesgold acknowledge that, so far at
least, attempts to facilitate text comprehension by providing speeded word
drills have not proved very successful (see especially Fleisher and
Jenkins, 1977).

While, like everyone else, the advocate of the instrumental
hypothesis favors lots of reading and varied language experience, the
distinctive feature of this view is that it invites direct vocabulary
building exercises. Becker (1977) has argued strongly for the
instrumentalist position. He maintained that once decoding skills have been
mastered, the chief remaining factor in determining whether a child will be
a successful reader is vocabulary knowledge. He claimed that schools have
never had reading programs that systematically build vocabulary. Children
from middle class backgrounds pick up word meanings anyway. But the same is
less true, Becker argued, of children coming from lower class homes, which
often fail to provide support for the continuous vocabulary and concept
growth important to school work. Consistent with this assumption is some
recent work by Hall and Tirre (1979), who found that lower class parents,
particularly lower class Black parents, use substantially fewer of the
words found in standardized intelligence tests when speaking with their
children than do middle class parents.


Becker proposed a reading curriculum in which every child would learn
about 7,000 basic words from direct instruction. The figure 7,000 comes
from one estimate of the number of basic words known by the average high
school senior (Dupuy, 1974). Becker acknowledged that there are families
of words with related meanings, thereby permitting the child some
generalization beyond the words that are specifically taught. By and large,
though, he believed that learning one vocabulary item gives little
advantage in learning the next one. For instance, he illustrated
morphological instruction on the following set of unrelated words: help,
support, insist, toll, resist, recognize, assist. Even his so-called
"concept side" of the instruction entailed a component analysis of isolated
words. So if this assumption is correct, direct teaching of a vocabulary of
even 7,000 basic words would be an enormous task. Becker estimated that
about 25 basic words would have to be taught per week from the third
through the twelfth grade (p. 534).
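
As a rough check on the scale of this proposal, the arithmetic can be
sketched as follows. The 7,000-word target, the rate of about 25 words per
week, and the span from third through twelfth grade come from the text. The
function name and the derived weeks-per-year figure are our own
illustration, since the length of the instructional year is not stated here.

```python
# Rough arithmetic behind a direct-instruction vocabulary curriculum.
# Figures taken from the text: 7,000 basic words, about 25 words per week,
# taught from the third through the twelfth grade.
# The function itself is a hypothetical helper, not part of any proposal.

def weeks_per_year_needed(total_words: int, words_per_week: int,
                          first_grade: int, last_grade: int) -> float:
    """Instructional weeks per school year implied by the stated rate."""
    years = last_grade - first_grade + 1       # grades 3..12 inclusive
    words_per_year = total_words / years       # words to cover each year
    return words_per_year / words_per_week     # weeks of teaching per year


if __name__ == "__main__":
    weeks = weeks_per_year_needed(7000, 25, 3, 12)
    print(f"{weeks:.0f} instructional weeks per year")
```

Under these figures, the stated rate amounts to 28 weeks of vocabulary
teaching in each of ten school years.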


The distinctive curriculum implication of the knowledge hypothesis
is that generally new vocabulary ought to be learned in the context of
acquiring new knowledge (cf. Goodman, 1976, p. 487). Every serious student
of reading recognizes that the significant aspect of vocabulary development
is in the learning of concepts, not just words. The additional point that
the knowledge position brings to the fore is that concepts come in
clusters that are systematically interrelated. Returning to an earlier
example, the concept of mast cannot be acquired independently of concepts
such as boat and sail. Thus, it would seem to be sensible for people to
learn the jargon in the context of learning about sailing and the anatomy
of sailboats. According to the knowledge hypothesis, if a child were really
naive, trying to teach a single sailing concept and word in isolation from
the set of related concepts and words would be inefficient in the best case
and completely fruitless in the worst case.

A thought experiment suggests the more general point about the role
of knowledge in vocabulary learning. Suppose you wished to teach some
French vocabulary to, let us say, two groups of English-speaking Canadian
children, evenly matched on aptitude and achievement. One group is from
a downtown urban area, the other is from a small fishing village. The
body of words you wish to teach is concerned with fishing (trawlers, rods,
nets, casting, bait, currents, etc.). Would you expect one group to learn
the words more quickly and easily than the other? Why? We do not know
of research that has dealt systematically with these questions. One
somewhat relevant study was carried out by Allen and Garton (1968). They
found that physics students were much better than art students in
recognizing physics words. They concluded that, for art students, physics
words are semantically indistinct and thus have to be recognized on a more
piecemeal basis. Familiarity with an area of knowledge increased the
familiarity of the physics words.

Knowledge can be sliced in various ways. Thus far in this section,
we have considered sets of words related because they are used in talking
about the same topic. Words may also be conceptualized in terms of
families related to one another because they convey related sets of
distinctions. Consider an example involving verbs of visual perception.
The basic verb is see. If you notice that look involves a deliberate
act of seeing, it can then be appreciated that glimpse refers to a short
act of seeing whereas glance refers to a short act of looking. Stare,
on the other hand, refers to a prolonged act of looking. The variations
in sense among these verbs can be understood in terms of just two semantic
features, intention and duration. Further distinctions would be required
to encompass other verbs of visual perception such as notice and examine.
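
The two-feature analysis of these verbs can be laid out as a small table.
The sketch below is only one possible encoding: the names VERBS and
contrast are ours, seeing is coded as not deliberate and looking as
deliberate, and None marks a duration the verb leaves unspecified. None of
these coding choices is prescribed by the analysis itself.

```python
# Verbs of visual perception described by two semantic features,
# intention ("deliberate") and duration, following the example in the text.
# The encoding is an illustrative assumption: None = feature unspecified.

VERBS = {
    "see":     {"deliberate": False, "duration": None},
    "look":    {"deliberate": True,  "duration": None},
    "glimpse": {"deliberate": False, "duration": "short"},
    "glance":  {"deliberate": True,  "duration": "short"},
    "stare":   {"deliberate": True,  "duration": "prolonged"},
}


def contrast(a: str, b: str) -> dict:
    """Return the features on which two verbs differ, with both values."""
    fa, fb = VERBS[a], VERBS[b]
    return {name: (fa[name], fb[name]) for name in fa if fa[name] != fb[name]}


if __name__ == "__main__":
    # glimpse and glance differ only in intention
    print(contrast("glimpse", "glance"))
    # glance and stare differ only in duration
    print(contrast("glance", "stare"))
```

Each pair of verbs in the passage differs on one feature or both, which is
what it means to say that two features suffice for this small family.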

We would consider a lesson that helped children sharpen and extend
the distinctions involved in visual perception words to be consistent
with the spirit of the knowledge position. What the knowledge
position would not countenance is a separate vocabulary lesson that
included glance, mast, and a miscellany of other words. Herein lies a
difference from the instrumentalist position, which does not seem to us
to preclude exercises involving lists of unrelated words.

Johnson and Pearson's (1978) book, Teaching Reading Vocabulary,
appears to represent predominantly the knowledge position, though it is
an eclectic treatment that also reflects influences from the other two
views. Johnson and Pearson advocated teaching a basic sight vocabulary
using "intensive direct instruction in the early grades and with older
children who do not read well" (p. 28). They also endorsed both direct
and indirect means for teaching phonics, promoting morphological
analysis, causing vocabulary knowledge to expand, and teaching the use of
the dictionary and thesaurus. Johnson and Pearson devoted a chapter to the
use of contextual clues to figure out the meanings of unfamiliar and
ambiguous words. Otherwise most of the exercises and games suggested
throughout the book involve sets of words outside the context of
stories or textbook chapters. However, the words usually involved sets
of interrelated distinctions, such as were illustrated above with verbs
of visual perception. Almost every activity was designed to expand
children's sensitivity to these distinctions. There is an apparent
discrepancy between the goals of the activities, which are concerned
with conceptual distinctions and relations, and the format of the
activities, which is based largely on isolated words. If the knowledge
perspective were strictly adhered to, vocabulary instruction
would not be thought of as a separate subject in school.

For the sake of clarity of exposition, we have presented the
aptitude, instrumental, and knowledge positions in uncomplicated and
somewhat overdrawn form. We must emphasize again that no serious scholar in
reading or related fields rigidly adheres to any one of these positions.
In particular, Hunt, who has been identified with the aptitude hypothesis,
has explicitly and emphatically stated that vocabulary size also is a
reflection of an individual's accumulated knowledge of the world. Becker,
whom we labeled an instrumentalist, heartily endorses some of the
implications of both the aptitude and the knowledge views. Reading has been
a fractious field. If a policy were followed of avoiding controversy
where none genuinely exists, the quality of intellectual exchange and
the sociopolitical climate might improve to the point where someone within
the next decade could write a book entitled "Learning to Read: The Great
Consensus."

What Does It Mean to Know the Meaning of a Word? 

It is not clear that, if Ludwig Wittgenstein and Bertrand Russell
were left alone in a room for three hours, they could decide that they
really knew the meaning of dog. As Labov (1973, p. 341) said, "Words
have often been called slippery customers, and many scholars have been
distressed by their tendency to shift their meanings and slide out from
under any simple definition."

An ordinary adult engaging in an ordinary conversation will be
absolutely sure he knows the meanings of almost all of the words he
hears. Notice that the restriction to ordinary use is an important
aspect of this confidence. Consider the term gold, for example. The
person who is sure he knows the meaning of this word in an ordinary use
will quickly retreat when in the company of jewelers, mining engineers,
geological survey assayists, or metallurgists.

What does a person know when he knows the meaning of a word in its
ordinary, everyday, garden-variety sense? This issue is addressed in
what we will refer to as the Standard Theory of semantics, according to
which the meaning of a word can be analyzed into features (also called
components, attributes, or properties), each of which represents one of
the distinctions conveyed by the word. Necessary or essential features
are usually distinguished from features that are merely characteristic.
For instance, having a back could be said to be a necessary feature of
chair since an object that is otherwise a chair except for the lack of
a back is really a stool instead of a chair. On the other hand, the
ability to fly is only a characteristic feature of bird since some birds
(penguins) don't fly at all and others (chickens) do so very poorly.

To define a term, in the strong sense, is to list the features
necessary to capture the essence of the thing (or event or quality)
designated by the word. Saying this another way, a proper definition
indicates the attributes a thing must have in order to be designated by
a word; if any of these necessary properties were missing that word
would not apply. Before we choose this as our criterion in the testing
of children's word knowledge, however, we might wish to examine how well
it applies to adults' normal use and understanding of words.
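
The strong sense of definition just described can be stated as a simple
rule: a word applies to a thing only if the thing has every necessary
feature, whereas characteristic features play no part in the decision. The
sketch below is a minimal illustration under our own assumptions. The class
name Definition and the feature labels other than "back" and "flies" are
hypothetical and are not drawn from any semantic theory's actual inventory.

```python
# Strong-sense definition under a feature theory of meaning:
# a word applies to a thing only if the thing has EVERY necessary feature.
# Characteristic features are typical but not required.
# Feature labels such as "seat" and "feathers" are hypothetical examples.

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Definition:
    word: str
    necessary: frozenset
    characteristic: frozenset = field(default_factory=frozenset)

    def applies_to(self, features) -> bool:
        """True when all necessary features are present in `features`."""
        return self.necessary <= set(features)


CHAIR = Definition("chair", frozenset({"seat", "back"}))
BIRD = Definition("bird", frozenset({"feathers"}), frozenset({"flies"}))

if __name__ == "__main__":
    stool = {"seat", "legs"}           # lacks a back
    penguin = {"feathers", "swims"}    # lacks the characteristic "flies"
    print(CHAIR.applies_to(stool))     # the necessary feature is missing
    print(BIRD.applies_to(penguin))    # only a characteristic feature is missing
```

On this rule the stool is not a chair because a necessary feature is
missing, while the penguin is still a bird because flying is only
characteristic.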

How able are people to define the words they are sure they know? "Not
very" is the answer if one insists upon the strong sense of define. Consider
gold again. Upon being asked to define gold, the ordinary citizen might
say that gold (a) is precious, (b) is a metal, and (c) has a
particular yellowish (i.e., golden) hue. The problem is that none of these
is a necessary feature. Not all gold is a golden color. If, say, the
Chinese were to discover a mountain of gold, the substance would no longer
be precious. Not even the attribute of being a metal can be considered
to be an eternal, immutable property of gold for, unlikely though it is,
there might be a scientific breakthrough in which it was discovered that
gold is not a metal.

A unicorn is a beast with such and such defining characteristics.
Of course there are no beasts with these properties, which is to say
that unicorns do not exist. By the same logic, if being precious and
being a metal are defining features of gold, it follows that if the
Chinese were to discover a mountain of the substance or scientists were
to determine that the substance is not a metal, one would be forced to
conclude that gold did not exist. As Putnam (1975) has noted, this is
a very odd conclusion, because there would still be this "stuff" lying
around that people used to call gold. We have a right to be suspicious
of a semantic theory that backs us into such a peculiar corner.

Another example will illuminate the point even more starkly. When
it comes to fine points of meaning, ordinary folks turn to experts as
the final arbiters--to jewelers and metallurgists for the exact meaning
of gold, to the Supreme Court for the proper interpretation of words in
the Constitution, and so on. For the sake of the argument, it may be
supposed that the American Psychiatric Association is the final arbiter
of the meaning of homosexual. For years, this august group defined
homosexuality as a disease of sexual orientation. Recently, however,
the association declared that homosexuality is not a disease. Anita
Bryant may not have agreed with that conclusion, but at least she
understood it. If the characterization of homosexuality as a disease
had been taken seriously as a defining feature, upon reconsidering its
position, the American Psychiatric Association would have had to
assert, "There is no such thing as homosexuality." That conclusion
would simply have left Ms. Bryant puzzled.

There are other serious problems with Standard Theory. Notably,
the members of a class called by the same name frequently do not all
share a single set of common properties. Wittgenstein (1953; see also
Rosch, 1973; Rosch & Mervis, 1975) argued that things designated by the
same word generally are related by "family resemblance." He intended
an analogy to a human family whose members look and act alike. Mother
and one son may have a prominent nose. Father and daughter may have
the same hair color. And so on. But there may be no single respect in
which they are all alike, no single feature which they all share.
Wittgenstein claimed family resemblance was the most accurate
characterization of the relationships among the various uses of most common
words. To illustrate his point, he analyzed uses of the term game,
noting the similarities and differences between team games, board games,
and children's games. Others have shown the fuzziness and context sensitivity
of the meanings of terms such as cup (Labov, 1973), eat (Anderson & Ortony,
1975), red (Halff, Ortony, & Anderson, 1976), and held (Anderson, Pichert,
Goetz, Schallert, Stevens, & Trollip, 1976).

A great deal more could be said about semantic theory. (For
authoritative, current treatments, see Clark & Clark, 1977, especially chapters
11-14; Fillmore, 1975; and Miller & Johnson-Laird, 1976.) The main point
of this brief excursion into the meaning of meaning is to caution against
holding up a standard of word comprehension for children that adults could
not meet.

Depth of Word Knowledge 

It is useful to distinguish between two aspects of an individual's
vocabulary knowledge. The first may be called "breadth" of knowledge, by
which we mean the number of words for which the person knows at least
some of the significant aspects of meaning. Later sections of this paper
will be concerned mainly with breadth of knowledge.

Treated in this section is a second dimension of vocabulary
knowledge, namely the quality or "depth" of understanding. We shall assume
that, for most purposes, a person has a sufficiently deep understanding
of a word if it conveys to him or her all of the distinctions that would
be understood by an ordinary adult under normal circumstances.

Eve Clark (1973) has marshalled an array of evidence which shows that
the meaning a young child has for a word is likely to be more global, less
differentiated than that of an older person. With increasing age, the
child makes more and more of the adult distinctions. In other words, when
first acquired, the concept a child has for a word need not include all
of the features of the adult concept. Eventually, in the normal course
of affairs, the missing features will be learned.

While there are some differences in theoretical interpretation and
some findings appear to hinge on procedural details (Brewer & Stone, 1975;
Glucksberg, Hay, & Danks, 1976; Richards, 1976; Nelson, 1977), most of the
research done to date supports the conclusion that there is progressive
differentiation of word meanings with increasing age and experience.

Just one illustration will be provided of the kind of evidence that
points to this conclusion. Gentner (1975) completed a theoretical analysis
of verbs of possession which indicated that buy, sell, and spend entail
a more complex set of distinctions than give and take. Notice that giving
involves the transfer of something from one person to another. Selling
likewise involves the transfer of something from one person to another,
but it involves an additional transaction as well, the transfer of money
from the buyer to the seller. The complementary relationship holds
between buying and taking.

Gentner expected children to acquire the full, adult meanings of
these verbs in order of complexity. Children ranging from four to eight
years of age were asked to make dolls act out transactions from directions
involving each verb. For example, the children were requested to "make
Ernie sell Bert a (toy) car." The four-year-olds performed flawlessly
with directions containing give and take, but never correctly executed
instructions that involved spend, buy, or sell. The eight-year-olds
exhibited nearly perfect understanding of every direction except the ones
containing sell. Overall, the results were exactly as expected: the
adult meanings of verbs of possession are acquired in order of complexity.

Gentner's analysis of the children's errors suggests that the younger
ones treated the complex verbs as though they were simpler forms. She
explained (p. 242), ". . . the commonest incorrect response was some form
of one-way transfer . . . the young child acting out buy and sell
completely disregards the money transfer that should be part of their
meanings, yet performs the object transfer in the correct direction. He
reacts to buy as if it were take. He treats sell as if it were give."
When asked to "make Bert spend some money" even the youngest child
correctly handles the money transfer, but he neglects to have Bert get
anything for the money he "spends." The child treats spend money as
though it meant give money away.

Through some quirk of the sociology of science, the in-depth study 
of word knowledge has been the special province of psycholinguists 
studying language development in young children. There is a substantial 
body of literature on selected vocabulary of children from about two 
through eight years of age. The literature involving older children and 
adults is meager.

In our judgment, people's vocabulary knowledge continues to deepen
throughout their lifetimes; that is, as they grow older, most people
continue to learn nuances and subtle distinctions conveyed by words that
in some sense they have known since childhood. There is no hard data to
support this conjecture. However, an illustration will show that many
adults still have something to learn about even fairly common words. It
is easy to find educated adults who confuse infer and imply. A person
will say something along the lines, "I intended, by stating these
arguments, to infer that . . . ." Of course, this individual should have
said imply. Speakers imply; listeners infer. The complication, which
no doubt makes the distinction difficult, is that speakers may report
inferences they have made as well as get implications across to listeners.

Breadth of Word Knowledge

It is disturbing to examine available estimates of the average
vocabulary size of various age groups. Table 2 summarizes studies that
have been carried out to estimate total basic or "root" word knowledge.
It can be seen that the estimates vary wildly.

Insert Table 2 about here 

It is not obvious how to evaluate the different sampling methods
and response criteria that have been employed in research attempting to
estimate vocabulary size. Recently, for instance, the distinguished
psycholinguist, George Miller (1978), stated:

Although the rapid rate of syntactic acquisition has inspired much
respectful discussion in recent years, the rate of lexical growth
is no less impressive. The best figures available indicate that
children of average intelligence learn new words at a rate of more
than 20 per day. It seems necessary to assume therefore, that at
any particular time they have hundreds of words roughly categorized
as to semantic or topical relevance but not yet worked out as to
precise meaning or use. (p. 1003)

Miller did not specify whether or not he was referring to "basic"
words. If he was, then he is positing a mean annual word acquisition rate
of over seven thousand words, or about fifty thousand over the elementary
and middle school years. This seems unlikely even in the light of the
highest estimates summarized in Table 2. He may have been including
compounds and derivatives. However, to our knowledge, no systematic
examination of children's ability to understand these forms has been
completed. Miller's statement highlights two points. First, in its
original context, the statement is a crucial step in an argument about
lexical development. Accurate estimates of the growth of word knowledge
are an important element in discussions of lexical and conceptual
development and the relationship between them. Second, how do we assess
what are the "best figures available"?
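
The arithmetic behind these figures can be checked with a short sketch. The rate of 20 words per day is taken from Miller's statement; the seven-year span for the elementary and middle school years is an assumption made here only for illustration, chosen because it reproduces a total of about fifty thousand.

```python
# Rough check of the acquisition rates implied by Miller's (1978) figure.
# WORDS_PER_DAY comes from the quotation; YEARS_OF_SCHOOLING is an
# illustrative assumption, not a value stated in the text.
WORDS_PER_DAY = 20
DAYS_PER_YEAR = 365
YEARS_OF_SCHOOLING = 7  # assumed span of elementary and middle school


def annual_rate(words_per_day: int, days_per_year: int = DAYS_PER_YEAR) -> int:
    """Words acquired in one year at a constant daily rate."""
    return words_per_day * days_per_year


def cumulative_words(words_per_day: int, years: int) -> int:
    """Words acquired over a span of years at a constant daily rate."""
    return annual_rate(words_per_day) * years


if __name__ == "__main__":
    print(annual_rate(WORDS_PER_DAY))                            # 7300
    print(cumulative_words(WORDS_PER_DAY, YEARS_OF_SCHOOLING))   # 51100
```

At exactly 20 words per day the annual figure is 7,300, consistent with "over seven thousand." A different assumed span of years changes the cumulative total proportionally.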

In 1940, Seashore and Eckerson remarked that, even though the field
of vocabulary testing is a "fairly old one" (p. 35), substantial
problems of measurement remained. By now, in the time span of
educational research, we might want to call the field "ancient," and virtually
all of those original problems persist.

There are important practical reasons for attempting to make accurate
assessments of total word knowledge. Language and reading programs aim to
increase students' vocabularies. The number of words presented to students
varies, in part, according to what is regarded as the most authoritative
thinking and research on vocabulary size and growth (Clifford, 1978). More
reliable estimates would indicate the appropriateness of the assumptions
of a program, and perhaps highlight periods of growth to be capitalized
upon. More generally, reliable estimates would indicate whether direct
language instruction can plausibly account for a substantial proportion
of the child's language growth, or whether word knowledge is acquired for
the most part independently of formal instruction. To refer again to a
concrete proposal, Becker's (1977) idea that underachieving children
should be taught via direct instruction the vocabulary most high school
seniors possess would be difficult, but perhaps feasible, if the children
had to learn 23 new words a week. It would be out of the question if
they had to learn 25 words each school day.

Next we will present some of the central issues in broad-gauged
measurement of word knowledge. The discussion of these issues will
reveal many of the reasons why estimates of vocabulary size have fluctuated
so widely. Two general questions need to be considered. First, how is
a sample of words to be selected? Second, what kind of response from a
subject will be regarded as evidence that a word is in the individual's
vocabulary?

Selecting a Sample of Words

In determining what is to count as a word, the researcher needs to
decide whether or not it is of interest to discern the subjects' ability
to use derivatives and compounds (plurals, participles, tense markers,
comparatives, etc.). Some authors, notably Seashore (1933), have
preferred to calculate separate estimates for "special" terms and derivatives.
Others, for example Dupuy (1974), have attempted to concentrate solely on
"basic" words. Dupuy, the author of one of the most recent and thorough
studies of word knowledge, sampled randomly from Webster's Third New
International Dictionary (1961) and then applied three criteria to each
word selected. The word had to be a main entry, a single word form (i.e.,
not a derivative or compound), and could not be technical, slang, foreign,
or archaic.

The systematic nature of this sampling creates its own equally
systematic biases. Some children may have acquired the generative rule
for, say, negation by prefix (for example, unable or dishonest), and others
may not have (Silvestri & Silvestri, 1977). Do we wish to exclude this
element of vocabulary knowledge from the measure? Adults acquire a number
of special or technical terms in their areas of expertise or interest, so
exclusion of technical terms denies many subjects the opportunity of
indicating their knowledge of a large number of words.

What counts as a word will depend upon the researcher's principal
purposes. However, affixes and derivatives are important elements of
word knowledge, and several questions related to their role are of
considerable interest: In what way does knowledge of basic or root word
forms relate to knowledge of the compound forms? Are entries organized
conceptually in the personal dictionary such that the probability of
knowing a compound word is the same as that of knowing all its family
members, basic form included? Or is the chance of knowing a compound
some combination of the frequencies of the particular compounding
elements? Much is to be gained from research into these issues.

Whatever criteria are applied, there can be no doubt that there are
many thousands of words in English. Dupuy (1974) estimated that there
are about a quarter of a million main entries in Webster's Dictionary
(1961). Of these, he calculated that about 12,300 are basic words.

A source and method of selecting from that source is required which
will lead to the most accurate estimates of total word knowledge. The
most obvious way to start is to sample randomly from an unabridged
dictionary. Dupuy (1974), for instance, selected one word from every
page of the dictionary (the third word from the top of alternating
columns), and then applied the three criteria mentioned earlier for
selecting the basic words out of this group. This procedure produced
a final sample of 123 basic words.

Once a random sample of words has been selected, a test is
constructed to assess how many of the words a person knows. Then, in
principle, estimating the person's vocabulary size is straightforward.
For instance, Dupuy's Basic Word Vocabulary Test contains 1% of the 12,300
basic words he calculated are in Webster's. Therefore, the absolute size
of the basic word vocabulary can be approximated by multiplying the score
on this test by 100. A person whose score is 60, after correction for
guessing, would be judged to have a basic vocabulary of 6,000 words.
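
The projection just described can be sketched in a few lines. The correction formula below (right answers minus wrong answers divided by the number of choices less one) is the conventional correction for guessing and is an assumption here, since the text does not say which correction Dupuy applied. The function names and the five-choice item format in the usage note are likewise hypothetical.

```python
def correct_for_guessing(right: int, wrong: int, n_choices: int) -> float:
    """Conventional correction: right minus wrong / (choices - 1).

    This formula is an assumption for illustration; the source does not
    specify the correction actually used.
    """
    if n_choices < 2:
        raise ValueError("a multiple choice item needs at least two choices")
    return right - wrong / (n_choices - 1)


def estimate_vocabulary(corrected_score: float, sample_size: int,
                        population_size: int) -> float:
    """Scale the number of sampled words known up to the whole population."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return corrected_score * population_size / sample_size
```

With a sample of 123 words drawn from 12,300, the scaling factor is 100, so a corrected score of 60 projects to 6,000 basic words. A hypothetical examinee with 66 right and 24 wrong on five-choice items would receive that corrected score of 60.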

One disadvantage of this method is self-evident. Estimated
vocabulary size depends heavily on the size of the dictionary. With respect
to Dupuy, while he sampled initially from a large unabridged dictionary,
a word had to appear as a major entry in each of three other smaller
dictionaries in order to be counted as a basic word. A total of 979
words, 41% of the sample, were discarded on the basis of this rule. The
result was a very conservative estimate of the number of basic words in
American English and is one reason Dupuy's estimates of basic vocabulary
size are so much smaller than those of other investigators. Of course,
many of these words were very rare, but others such as cloudlet, escaping,
breezes, invited, starling, and unilateral would be familiar to most
people.

Already discussed is the issue of what to do with derivative and
compound forms. A liberal policy will lead to large estimates of
vocabulary size. A conservative policy will produce smaller ones. Dupuy was
conservative. He eliminated 7.7% of the words in his sample on the grounds
that they were compounds or derivatives, including a great many familiar
ones, such as grandchild, package, and toothache.

There are other, more subtle considerations in selecting a random
sample of words from a dictionary. Some procedures for sampling from an
unabridged dictionary can introduce systematic error since all entries do
not occupy the same amount of space on a page. This disproportion typically
favors the words in more common use since these are the most elaborated,
particularly in an unabridged dictionary where very many derivatives may
be listed (Williams, 1932; Lorge & Chall, 1963). Consequently, while the
words may seem to have been randomly selected, the frequency distribution
of the sample may be substantially different from that of the population.
This may partly account for the very large estimates of Seashore and
Eckerson (1940) and Smith (1941).
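
The size of this bias can be illustrated with a small calculation. The sketch below assumes a procedure that lands on a point of the page and takes whatever entry is found there, so that an entry's chance of selection is proportional to the space it occupies. The dictionary composition, entry lengths, and knowledge rates are invented for illustration and do not come from any of the studies cited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryClass:
    """A group of dictionary entries sharing a length and a knowledge rate."""
    n_entries: int        # number of entries in the group
    lines_per_entry: int  # space each entry occupies on the page
    p_known: float        # proportion of the group the person knows


def true_vocabulary(classes):
    """Number of entries the person actually knows."""
    return sum(c.n_entries * c.p_known for c in classes)


def estimate_by_entry(classes):
    """Expected estimate when every entry is equally likely to be drawn."""
    total = sum(c.n_entries for c in classes)
    p = sum(c.n_entries / total * c.p_known for c in classes)
    return p * total


def estimate_by_space(classes):
    """Expected estimate when entries are drawn in proportion to their space."""
    total = sum(c.n_entries for c in classes)
    space = sum(c.n_entries * c.lines_per_entry for c in classes)
    p = sum(c.n_entries * c.lines_per_entry / space * c.p_known
            for c in classes)
    return p * total


# Hypothetical dictionary: 100 common words with long entries, all known,
# and 900 rare words with short entries, of which one in five is known.
DICTIONARY = [EntryClass(100, 10, 1.0), EntryClass(900, 2, 0.2)]
```

Under these invented numbers the person knows 280 words. Sampling entries with equal probability recovers that figure in expectation, while sampling by space yields an expected estimate of about 486, because the long entries for common words are drawn far more often than their share of the entry count warrants.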

A further problem is that projecting a vocabulary size from performance
on a random sampling of words is inefficient. If the subject provides the
meaning of bibulous, then using up test time by asking for the meaning of
bicycle is wasteful. When estimating subjects' total vocabulary size is
the researcher's major aim, then efficiency of items covered per unit of
examinee time is an important consideration.

One obvious response to these problems is to select the sample from
a frequency distribution of words. Terman and Merrill (1937) arranged
their sample of words in order of "difficulty." When the subject failed
at six consecutive words, the vocabulary test was stopped. Dupuy (1974)
recommended a similar procedure. Time can be saved by such a procedure,
but vocabulary size is likely to be underestimated. Furthermore, heavy
stress is placed on the assumption that the frequency distribution of the
sample mimics that of the population. If this assumption fails, then
multiplication of the subject's score by the appropriate constant will
produce a poor estimate of total words known.

The characteristics of the two major, current word frequency
compilations available (Carroll, Davies, & Richman, 1971; Kucera & Francis,
1967) suggest a potential problem with frequency sampling. These analyses
indicate that the distribution of words is highly unbalanced, a conclusion
reached over 25 years ago by Horn (1954), who calculated that about 2,000
types will account for about 95% of "running words in adult writing";
3,000 for 96.9%; 4,000 for 97.8%; and 10,000 for 99.4%. At the low
frequency end of the scale, there is a tail that approaches infinity. Even
in a huge corpus, a vast number of words appear only once, twice, or not
at all. Of the 86,741 word types listed by Carroll, Davies, and Richman
from a corpus of over 5 million tokens, 35,079, or 40.44%, appeared
once. Kucera and Francis found 44.72% of the words appeared once
in a sample of over one million tokens. So, if the test is short, the
subjects run the risk of not being able to show that they know several
medium frequency words, since there will be such a large proportion of
rare words in the sample. A resolution of this issue is important, since
a frequency-based sampling technique seems the most accessible method for
overcoming the problems of simple random sampling.
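
The two statistics used in this paragraph, the share of types that occur only once and the share of running words covered by the most frequent types, are easy to compute from any token list. The sketch below is illustrative only: the function names are hypothetical and the toy corpus is invented. The one figure taken from the text is the count reported for Carroll, Davies, and Richman.

```python
from collections import Counter


def hapax_share(tokens):
    """Proportion of word types that occur exactly once in the token list."""
    counts = Counter(tokens)
    if not counts:
        raise ValueError("tokens must not be empty")
    once = sum(1 for n in counts.values() if n == 1)
    return once / len(counts)


def coverage(tokens, n_types):
    """Share of running words accounted for by the n most frequent types."""
    counts = Counter(tokens)
    if not counts:
        raise ValueError("tokens must not be empty")
    top = sum(n for _, n in counts.most_common(n_types))
    return top / sum(counts.values())


# Check of the proportion reported for Carroll, Davies, and Richman (1971):
# 35,079 of 86,741 types appeared once.
REPORTED_SHARE = 35_079 / 86_741  # about 0.4044

# Toy corpus, invented for illustration only.
TOY = "the cat saw the dog and the bird saw a worm".split()
```

In the toy corpus, six of the eight types occur once, and the two most frequent types account for five of the eleven running words. The same imbalance, on a far larger scale, is what the frequency compilations show.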

Frequency is a parameter which probably is very strongly related to
the probability that a word will be known. There is evidence supporting this
hypothesis from a number of areas: multiple choice performance on
standardized tests (Kibby, 1977), recall of word meanings following presentation
of pictures (Carroll & White, 1973; Duncan, 1977), and word recognition times
following tachistoscopic presentation (Rubenstein, Garfield, & Millikan,
1970; Cohen, 1976). The only discrepant finding has been that of Davis (1944b),
who found only a slight relationship between word difficulty and frequency.
He explained this result in terms of the role of compound words: While the
root of the word may be very common and well-known, a certain affix-root
compound may be very infrequent, but almost equally well-known if the affix
is familiar. A more analytic approach to the relationship of
frequency of usage to probability of knowledge would entail the use of
"family" frequency, that is, the frequency of the root word and all its
compounds and derivatives. We might expect that the relationship of this index
of frequency of usage to probability of knowledge would be more orderly.
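
A minimal sketch of the "family" frequency index follows. It assumes that a list of family members is available for each root word and that frequency counts come from some compilation; both the family list and the counts below are hypothetical.

```python
def family_frequency(root, families, counts):
    """Sum the counts of a root word and every listed member of its family.

    families maps a root word to its compounds and derivatives;
    counts maps a word form to its frequency in some corpus.
    Forms missing from counts contribute zero.
    """
    members = {root, *families.get(root, ())}
    return sum(counts.get(word, 0) for word in members)


# Hypothetical counts per million tokens and a hand-built family list.
COUNTS = {"honest": 40, "honestly": 25, "honesty": 18, "dishonest": 3}
FAMILIES = {"honest": ["honestly", "honesty", "dishonest"]}
```

Under these invented counts, dishonest has a frequency of only 3 on its own, but its family frequency is 86. On the argument above, the larger figure should be the better predictor of whether the word is known.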

Indeed, we are willing to go further and speculate that the
relationship between family frequency and probability of knowing a word resembles
the curve presented in Figure 1. In terms of breadth of knowledge, we
would expect a ceiling at the upper end of the frequency scale: most

Insert Figure 1 about here

people know all of the very common words. Other aspects of the curve would
differentiate individuals: the point at which the curve dropped from the
plateau level, and the slope of the function probably are the two
parameters that would capture the important individual differences. Even for
children, we might best think of the curve leveling out as the words become
very infrequent, since it is likely that, from their hobbies, interests,
or the occupation of their parents, most children would know some very rare
words. Nevertheless, we have drawn the lower portion of the curve as a
broken line since we are less sure about the relationship in this area.
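
One way to make the shape of such a curve concrete is a logistic function with a floor. This functional form is an assumption introduced here for illustration; the text specifies only a ceiling for common words, a point of descent, a slope, and a possible leveling out for very rare words. In the sketch, the drop point is approximated by the midpoint of the descent, and all parameter names are hypothetical.

```python
import math


def p_known(log_freq, drop_point, slope, floor=0.0):
    """Probability of knowing a word as a function of log family frequency.

    drop_point: log frequency at which the curve is halfway between the
                floor and the ceiling of 1.0.
    slope:      steepness of the descent from the plateau.
    floor:      level at which the curve flattens for very rare words.
    """
    if not 0.0 <= floor < 1.0:
        raise ValueError("floor must lie in [0, 1)")
    logistic = 1.0 / (1.0 + math.exp(-slope * (log_freq - drop_point)))
    return floor + (1.0 - floor) * logistic
```

Two individuals would then differ mainly in drop_point and slope, with floor reflecting the handful of rare words picked up from hobbies or family occupations. Fitting these parameters to a subject's responses would require a test sampled across the whole frequency range.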

In summary, a good test of word knowledge would present the subject
with a large number of words, sampled liberally from the whole range of
word frequency. Techniques should be developed which allow accurate
estimation of the relationship of a given subject's probability of knowing a
word and the frequency of the word's morphological family.

Criteria For Determining That a Word Is in a Person's Vocabulary 

Four sorts of test formats have been employed in attempts to assess
breadth of vocabulary knowledge: (a) multiple choice; (b) constructed
answer, in which the subject attempts to give a definition, a synonym, an
illustration, or use the word in a sentence or phrase; (c) yes/no judgments,
in which the subject checks the words in a list that he or she knows; and
(d) matching, where the subject pairs off words with their synonyms. Sims
(1929) compared these four types using data obtained from students in
fifth through the eighth grades. The correlation matrix Sims reported
is reproduced in Table 3. Sims concluded that, although the checking
method was as reliable as the others, it did not seem to offer acceptable

Insert Table 3 about here

construct validity. Only seventy words were used, however, and Sims failed
to counterbalance for order or delay between tests. While there may be
some questions about the trustworthiness of Sims' results, there is
intuitive sense in the notion that the constructed answer, multiple choice,
and matching tasks have more in common with one another than they have with
a checking task that is not corrected for guessing.

The question that needs examination is which of these methods will
be of most theoretical and practical value as a measure of vocabulary.
Three of these types will be discussed in the light of several issues.
Since the points raised about the multiple choice format apply even more
cogently to matching, the latter will not be dealt with separately.

Multiple choice methods. People often possess partial knowledge
of words. In these instances the items' distractors become crucial. An
individual may select the correct synonym for platitude from the choices:
(a) duck-billed mammal, (b) praise, (c) commonplace remark, (d) flatness.
He may make the correct selection because he has heard the word used in
reference to an utterance and with a negative connotation. This
information, however, may not enable him to select correctly from (a) commonplace
remark, (b) nonsense, (c) irrelevant question, (d) insult. The set of
choices constrains the individual's response to different degrees, and
different policies for generating distractors will, of course, lead to
differences in performance.

Lepley (1955, 1965), for example, constructed two forms of a synonym
test, one employing distractors from the same semantic category as the
target, and another which used distractors from semantically diverse
categories. Lepley (1965) found equal split-half reliability (.93 and
.94) but only a .66 correlation between performance on the two scales,
and significantly superior performance on the version requiring only
gross discriminations. The correlation is surprisingly low given the
common format and the fact that the superficial demand characteristics
were the same. Lepley's results illustrate the influence of the
distractor set.

The multiple choice format is currently the most widely used in
standardized vocabulary testing (e.g., Stanford Achievement Tests, 1973;
Metropolitan Achievement Tests, 1970; California Achievement Tests, 1977).
The principal complaint raised here so far is that the distractors cannot
avoid constraining the subject's response. If the purpose of the test is
to provide data on relative performance only, not on absolute level of
performance, then the distractors can be, and usually are, chosen to
maximize the discriminating power of the item. If one is interested in
vocabulary size, then this policy will not do.

Many vocabulary tests (e.g., Stanford, 1973) use sentence completion
in a multiple choice format. Many of the problems already mentioned apply
even when the test simulates a real encounter with the target word. In
addition, the question of the effects of various amounts of contextual
support on estimated vocabulary size, with groups of words that vary in
frequency of usage, has not been studied. There is research that suggests
that individuals vary not only in the size of their reading vocabularies
but also in their ability to use context to deduce the meanings of unknown
and partly known words (Pearson & Studt, 1975; Mason, Knisely, & Kendall,
1973).

A tricky problem with the multiple choice format is that young
children may not consider all the distractors (Asher, 1978; Brown, 1975;
Vurpillot, 1968). They will often choose the first or second alternative
if it makes reasonable enough sense. The test-taking strategies of older
children on multiple choice tests are not yet well characterized, but
there quite probably are strategic components of good performance which
serve to increase spuriously the relationship between a multiple choice
vocabulary test and other achievement or intelligence tests in the same
format. An insidious possibility is that some of the apparent growth
in vocabulary knowledge over the elementary school years is really
attributable to the acquisition of more sophisticated test-taking skills.

In conclusion, the multiple choice format is the most popular one.
It makes relatively efficient use of examinee time and must be reasonably
valid, otherwise the strong relationships between performance on such
tests and other measures of linguistic competence, summarized at the
beginning of this paper, would not have been obtained. The chief
complication with the multiple choice format, when one wants absolute measures
of vocabulary knowledge, is how to choose distractors. A further problem
is that multiple choice tests may make demands on strategic knowledge in
which young and poor readers are deficient.

Constructed answer measures. To overcome the problem of selecting
distractors, several researchers, notably Seashore (1933), Smith (1941), and
Terman and Merrill (1937), have used a constructed answer format, in
which the subject reads or hears the target word and then writes or
tells a definition of it, uses it in a sentence, gives a synonym for it,
or in some other way provides an indication of its sense and reference.
Subjects can be encouraged to do any one of these things just so long
as the experimenter is convinced the word is "known." This format is
capable of dealing with a variety of levels of knowing a word and avoids
the issue of distractors. There are, however, two substantial problems
with constructed answer measures: the problem of scoring the answers
and the problem of response bias.

In the written format, in particular, a constructed answer measure
is confounded by factors such as spelling ability, sentence construction
ability, and even the ability to write legibly, all of which may discourage
a subject from elaborating on a word used or understood in conversation.
A slightly more subtle problem, and one that is more difficult to control,
resides in the fact that, if a liberal criterion is used and the subject
is allowed a range of possible responses to a target word, then a
particular strategy for responding may be adopted. The problem is that some
words would be more easily explicated in a particular form. The word
noun may be more easily explained through illustration than by definition,
for instance. The research of Anglin (1970) and Wolman and Baker (1965)
indicates that, up to the age of about 10-12 years, children tend to
provide concrete definitions-by-illustration rather than by an inclusive
term or synonym. It is entirely possible that, depending on scoring
criteria, the preference at a different age for certain explanatory
strategies could produce spurious estimates of the rate of vocabulary
growth.

A really vexing problem is how liberally to score answers. How
does one score synonyms in relation to apt illustrations or perfect
usage in a sentence? In many instances, partial knowledge is displayed.
In one of our own recent testing sessions, it became clear that many
fifth grade students had partial knowledge of the word forbid. Several
students knew that it had something to do with not being permitted to do
something but did not have as part of their knowledge the fact that
forbid is used in imperative speech acts. We soon realized that, in this
case, we needed to ask for its use in a sentence. We have found other
more subtle and difficult cases of partial knowledge. For the word
propelled, there was no problem in the students' recognition of the word
because of their knowledge of propeller. When probed about the function
of a propeller, many came close to generating the notion of propulsion on
the theory that it would be strange to have a big round blade going
around on the front of a plane unless it served some fairly fundamental
purpose, and what planes do is move.

Some words have no near-synonyms. There are other instances when
the only synonym is a less frequent word than the target. In such cases,
the subject is being asked to produce a rare word in order to show that
a common word is known.

There are some almost irresistible tendencies displayed by an examiner
when administering a test with a constructed answer format. After a few
children have been tested, the examiner develops a sense of which words
are easy and which are difficult. It requires conscious effort to avoid
expecting more explanation of the difficult words and less for the easy
words. If every subject has known chair and the current subject pats
the seat of his stool as a response, then the tendency is to award full
marks. If he pats the wall for edifice, however, he might not score so
well. Similarly there is an urge to expect more elaborated responses
from older subjects. The preschooler who tells you that an automobile
"goes brrrrrmmm" will strike you more favorably than the college
sophomore who gives you the same answer. In addition, the experimenter will
witness explanations of words which entail subtle nonverbal as well as
verbal cues. Young children typically employ hand movements, facial
expressions, and gesture in their communications, especially when dealing
with words that are a little difficult for them.

The horns of the dilemma are these. Stringent, operational,
adult-like standards for evaluating whether a response indicates a word is
known will confound what is supposed to be a measure of breadth of
vocabulary knowledge with expository ability. Looser, more flexible
standards will confound the measure with the subjective judgment of
the examiners, which may change from word to word, subject to subject,
and occasion to occasion.

So the liabilities of the constructed answer method are both
logistical and substantial. It is inefficient per unit of testing and scoring
time, and it seems to rely on often subtle intuitions on the part of the
examiner, especially when the subject displays partial knowledge of an
item.

Yes/no format. The final format to be considered is that of "checking,"
which we prefer to term a yes/no method. In this format the subject simply
indicates whether or not the meaning of a word is known. Two of the major
difficulties that have arisen consistently in the discussion of the other
two major formats are the problem of response bias and the need to present
the subject with a large number of words chosen from a wide frequency range.
The checking format can satisfy the second criterion admirably but problems
of validity arise. Sims (1929) concluded:

The writer is inclined to believe that a good guess as to
whether or not a child knows the meaning of a word is almost
as satisfactory a method of determining vocabulary as checking
tests. The relative simplicity of such a measure, the ease of
preparation and administration should not blind one to its
invalidity. (p. 96)

Chall and Dale (1950) reported that the average tendency to overestimate
word knowledge in the yes/no format over and above the definition format
amounted to about 11% and was more pronounced for rare words.

It ought to be no real surprise that a yes/no test uncorrected for
decisions in the face of partial knowledge would give inflated estimates
of vocabulary size and would correlate poorly with other measures.
Consider the yes/no task from the point of view of the test taker. Some
individuals may deny that they know the word gold because they do not
know its atomic weight, while others will agree they know it because
they have a feeling that it can be used to refer to a color.

The problem of correcting yes/no test scores for guessing is not
insuperable. Stating the issue more precisely, guessing is only part of
the problem. The real issue, as the gold example illustrates, is one of
eliminating variation in the degree of confidence different individuals
must have before they are willing to say, "Yes, I know that word."
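
For the guessing component alone, a standard correction can be written
down directly. The sketch below is illustrative and is not taken from the
studies reviewed here. It assumes that the test also contains items no one
could know (for example, nonsense strings), that an examinee always says
yes to a word he or she knows, and that the examinee says yes to unknown
items at the same rate whether they are real words or nonsense items.
Under those assumptions the observed yes rate on real words is
H = p + (1 - p)F, where p is the proportion truly known and F is the yes
rate on nonsense items, so p = (H - F) / (1 - F). The function name and
the example figures are hypothetical.

```python
def corrected_proportion_known(hit_rate, false_alarm_rate):
    """Correction for guessing on a yes/no vocabulary test.

    hit_rate: proportion of real words checked "yes" (H).
    false_alarm_rate: proportion of nonsense items checked "yes" (F).
    Returns the estimated proportion of real words actually known,
    p = (H - F) / (1 - F), floored at zero.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie in [0, 1]")
    if not 0.0 <= false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie in [0, 1)")
    estimate = (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)
    # A negative value means the examinee said yes to nonsense items more
    # often than to real words; treat that as no demonstrated knowledge.
    return max(0.0, estimate)


# Hypothetical examinee: yes to 80% of real words and 20% of nonsense items.
example = corrected_proportion_known(0.80, 0.20)
```

A correction of this kind removes blind guessing but does nothing about
differences in how sure a person must be before saying yes, which is the
harder part of the problem.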

Signal detection theory (Swets, 1964) affords a conceptual and
computational framework that may allow estimation of amount of word knowledge
independent of judgmental standards. This theory was originally developed
for use in psychophysical experimentation. In this setting, typically
the subject is informed that he will hear short bursts of background noise
and that there may be a tone sounded as well. The subject's task is to
report whether or not a tone (the signal) was present. Research has
established that it is possible to get a very accurate estimate of a
person's capacity to detect the signal by correcting for whatever tendency
he or she has to report "hearing" the signal when it is not actually there.
Pastore and Scheirer (1974) have summarized research showing that this
paradigm can be applied to the analysis of a broad range of perceptual and
cognitive tasks. With respect to vocabulary assessment, the work of
Zimmerman and others (1977) has suggested that, by using close-to-English
nonsense letter strings as the "noise only" stimuli, signal detection
methods might be applied to word knowledge.
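
To make the idea concrete, the following sketch computes the two usual
signal detection indices for a single examinee: sensitivity (d') and the
response criterion (c). It is a minimal illustration under the common
equal-variance Gaussian model, not the procedure used in any study cited
here. Real words play the role of "signal" trials and nonsense strings
the role of "noise only" trials. The function name, the example counts,
and the 0.5 adjustment that keeps rates away from exactly 0 or 1 are
assumptions of the sketch.

```python
from statistics import NormalDist


def sensitivity(yes_to_words, n_words, yes_to_nonwords, n_nonwords):
    """Return (d_prime, criterion) for one examinee on a yes/no test."""
    # Log-linear adjustment (an assumption of this sketch): add 0.5 to
    # each count and 1 to each total so the z-transform stays finite.
    hit_rate = (yes_to_words + 0.5) / (n_words + 1)
    false_alarm_rate = (yes_to_nonwords + 0.5) / (n_nonwords + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # d' measures how well words are told apart from nonsense strings.
    d_prime = z(hit_rate) - z(false_alarm_rate)
    # c measures willingness to say yes: positive is cautious,
    # negative is liberal.
    criterion = -(z(hit_rate) + z(false_alarm_rate)) / 2
    return d_prime, criterion


# Two hypothetical examinees, each given 100 words and 100 nonsense strings.
cautious = sensitivity(60, 100, 2, 100)   # few yeses, almost none to nonsense
liberal = sensitivity(90, 100, 30, 100)   # many yeses, many to nonsense
```

The raw yes counts alone would rank the second examinee far above the
first. Separating d' from c shows how much of that gap reflects readiness
to say yes instead of knowledge of the words.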

We are currently analyzing data collected from elementary and high
school subjects on large numbers of words. The students responded yes
or no to a mixture of many English words and almost as many nonsense
words. Later they completed standardized multiple choice questions on
the real words. Our preliminary analyses have indicated that yes/no
scores adjusted according to signal detection theory, and other
corrections for guessing and risk-taking, correlate highly with multiple
choice performance. We later interviewed the subjects individually about
a subset of the words. The data suggest that a value derived from the
yes/no task gives a better estimate of true word knowledge than performance
on the standardized multiple choice test.

The fact that words have multiple meanings poses a problem for the
yes/no task, since presumably a person will check "yes" if he or she knows
any meaning of a word. This is not a small problem. According to Lovell
(1941), 43% of the words used by Seashore and Eckerson (1940) had multiple
meanings. Recently, Balch (cited in Johnson & Pearson, 1978, p. 17) has
reported that from 23% to 42% of the words in six widely used basic
vocabulary lists have multiple meanings. In other recent research, Mason,
Knisely, and Kendall (1973) have shown that children are much less
likely to know the secondary than the primary meaning of words used in
their secondary sense in a popular basal series. It is apparent that the
yes/no format is not suitable for distinguishing which of the meanings of
a word are known. When that is the goal, some other method of assessment
is required.

In summary, the great attraction of the yes/no format is that it
permits the presentation of a very large number of words in a given
interval of examinee time. Compared to the multiple choice format, it reduces
somewhat the burden of preparing distractors and, compared to constructed
answer formats, it sidesteps vagaries of scoring. The notable problem
with the yes/no task is that scores of individuals will be influenced
markedly by differences in tendency to take risks in the face of uncertainty.
If this problem can be solved, the yes/no task might be very useful for
assessment of breadth of word knowledge.

Conclusion

While current research demonstrates the importance of such factors
as a reader's perspective on a text (Pichert & Anderson, 1977) and text
structure (Meyer, 1975; Mandler & Johnson, 1977), it is also clear that
word knowledge is a requisite for reading comprehension: people who do
not know the meanings of very many words are most probably poor readers.
There are serious gaps in our understanding of why this is true and of
how word knowledge grows throughout the life span. Filling those gaps
promises to be both an intellectual and a practical challenge of
considerable importance. We judge that a critical first step is the
development of improved methods of assessing breadth of vocabulary knowledge.
It is only after some refinement has been achieved at this level that
models of lexical development and instructional programs can be based
on realistic expectations about the acquisition of word meanings.

We conclude our review of vocabulary knowledge and vocabulary size
with the realization that, since the turn of the century, a tremendous
amount of energy has been put into answering the question, "How many words
does an individual know?" We have come to wonder if this question is
properly framed. The nature of language may make it unanswerable and
thus, for scientific purposes, irrelevant. Empirical methods may be
able to generate useful indices such as that discussed earlier, the
relationship of the individual's knowledge of words to word frequency.
However, to produce a single value from performance on a sample to
represent total vocabulary size may be an exercise that relies too
heavily on the assumptions of a static population of isolated words
and on an overly restrictive view of how we generate and use words in
context.

Footnote

We are indebted to Charles Fillmore for this example. 



Table 1

Correlations of Various Vocabulary Tests With Tests of General Intelligence

Vocabulary      Intelligence    Subjects                N         r           Source
Measure         Measure

Terman, 1916    Binet (1916)    School children         631       .91         Terman (1918)
Terman, 1916    Binet (1916)    School children         269       .87         Mahan & Witmer (1936)
Terman, 1937    Binet (1916)    School children         65        .92         Spache (1943)
Terman, 1937    Binet (1916)    School children         1161      .98         Elwood (1939)
Terman, 1937    Binet (1916)    School children         753       .86         White (1942)
Terman, 1937    Binet (1916)    Standardization sample, 710       .71 to .86  McNemar (1942)
                                ages [illegible] 18
Wechsler        Wechsler        Adult males             1000      .82         Lewinski (1948)
Wechsler        WISC            Standardization sample, 600                   Wechsler (1949)
                                  age 7.5                         .71
                                  age 10.5                        .87
                                  age 13.5                        .78
Raven           Binet           School children         150       .93         Raven (1948)
Dupuy           Various tests   School children         2397      .76         Dupuy (1974)
Stanford Achievement Tests      Standardization samples 275,000               Stanford Achievement
(1973) (vocabulary with total     Grade 2               over      .82         Test (1973)
achievement test scores)          Grade 3               grades    .79
                                  Grade 4               and geog. .80
                                  Grade 5               locale    .80
                                  Grade 6                         .83
                                  Grade 8                         .89

Note. Adapted from Miner (1957).

Table 2

Some Previous Estimates of Total Vocabulary Size
at Selected Grades

Grade              Source                          Estimate

1st                M. E. Smith (1926)              2,562
                   Dolch (1936)                    2,703
                   Ames (1964)                     12,400
                   M. K. Smith (1941)              17,000
                   Shibles (1959)                  26,000

3rd                Dupuy (1974)                    2,000
                   Holley (1919)                   3,144
                   Terman (1916)                   3,600
                   Brandenburg (1918)              5,429
                   Kirkpatrick (1907)              6,620
                   Cuff (1930)                     [illegible]
                   M. K. Smith (1941)              25,000

7th                Dupuy (1974)                    4,760
                   Terman (1916)                   7,200
                   Holley (1919)                   8,478
                   Kirkpatrick (1907)              10,000
                   Brandenburg (1918)              11,445
                   Cuff (1930)                     14,910
                   Bonser et al. (1915)            26,520
                   M. K. Smith (1941)              51,000

College            Seashore (1933)                 15,000
sophomore          Kirkpatrick (1907)              19,000
                   Seashore & Eckerson (1940)      60,000
                   Gerlach (1917)                  85,300
                   Gillette (1927)                 127,800
                   Hartman (1946)                  200,000

Note. Adapted from Seashore and Eckerson, 1940, and Bayer, 1976.

Table 3

Correlations Between Four Types of Vocabulary Tests

                              1             2             3             4

1. Checking (yes/no)          .92*
2. Multiple choice            .54           [illegible]*
3. Matching                   .64           .85           .93*
4. Constructed answer         .56           .74           .82           .92*

Note. From Sims (1929).

*Split-half reliability coefficients.

Figure Caption

Figure 1. Possible relationship between likelihood word meanings
are known and frequency of usage.